Abstract

The development of microsatellite loci has become more efficient using next- 
generation sequencing (NGS) approaches, and many studies imply that the 
number of applicable loci is large. However, few studies have sought to quantify
the number of loci that are retained for use out of the thousands of sequence 
reads initially obtained. We analyzed the success rate of microsatellite loci 
development for three amphibian species using a 454 NGS approach on tetra- 
nucleotide motif-enriched species-specific libraries. The number of sequence 
reads obtained differed strongly between species and ranged from 19,562 for 
Triturus cristatus to 55,626 for Lissotriton helveticus, with 52,075 reads obtained 
for Calotriton asper. PHOBOS was used to identify sequences with tetra-nucleo- 
tide repeat motifs with a minimum repeat number of ten and high quality 
primer binding sites. Of 107 sequences for T. cristatus, 316 for C. asper and 319 
for L. helveticus, we tested the amplification success, polymorphism, and degree 
of heterozygosity for 41 primer combinations each for C. asper and T. cristatus, 
and 22 for L. helveticus. We found 11 polymorphic loci for T. cristatus, 20 loci 
for C. asper, and 15 loci for L. helveticus. Extrapolated, the number of poten- 
tially amplifiable loci (PALs) resulted in estimated species-specific success rates 
of 0.15% (T. cristatus), 0.30% (C. asper), and 0.39% (L. helveticus). Compared 
with representative Illumina NGS approaches, our applied 454-sequencing 
approach on specifically enriched sublibraries proved to be quite competitive in 
terms of success rates and number of finally applicable loci. 



Received: 2 July 2013; Revised: 7 August 
2013; Accepted: 12 August 2013 

Ecology and Evolution 2013; 3(11): 3947- 
3957 



doi: 10.1002/ece3.764



© 2013 The Authors. Ecology and Evolution published by John Wiley & Sons Ltd. 

This is an open access article under the terms of the Creative Commons Attribution License, which permits use, 
distribution and reproduction in any medium, provided the original work is properly cited. 






NGS Based Microsatellite Loci Development 



A. Drechsler et al.



Introduction 

Microsatellite loci are still considered valuable tools for 
addressing basic questions in ecology, evolution, and 
behavior in nonmodel organisms, despite the fact that 
other molecular markers have become increasingly popu- 
lar due to next-generation sequencing (NGS) approaches 
(e.g., genotyping or sequencing of single-nucleotide poly- 
morphisms (SNPs)). Microsatellite loci are currently still 
the marker of choice for comprehensive analyses of popu- 
lation structure (e.g., Palo et al. 2004; Jehle et al. 2005, 
2007; Dogac et al. 2013), mating systems (e.g., Jones et al. 
2002; Schmeller et al. 2005; Steinfartz et al. 2006; Jehle 
et al. 2007; Loyau and Schmeller 2012), landscape (Storfer 
et al. 2010) and conservation genetics (reviewed by Jehle 
and Arntzen 2002; Beebee 2005; Schmeller and Merila 
2007; Duong et al. 2013). Moreover, the current impact 
of microsatellite loci as genetic tools is also illustrated by 
more than 4000 scientific studies that have been 
published in the past two years matching the query "mi- 
crosatellite loci" in the Web of Science and by recent 
publications reporting the development of new loci (e.g.,
Prunier et al. 2012; Castoe et al. 2012a,b; Dobes and 
Scheffknecht 2012). Before the application of NGS in the 
process of developing microsatellite loci, the isolation and 
characterization of new loci was a costly and sometimes 
elaborate endeavor (e.g., Zane et al. 2002). However, by 
using NGS approaches on genomic libraries enriched for 
microsatellite motifs, the isolation process has become 
much simpler and more cost-effective (e.g., Abdelkrim 
et al. 2009). Normally, these approaches result in tens of 
thousands of sequence reads, which are expected to lead 
to a large number of suitable microsatellite loci (e.g.,
Yang et al. 2012). However, the correlation between the 
initial number of sequence reads obtained and the num- 
ber of usable polymorphic microsatellite loci may be low 
as the number of potentially amplifiable loci (PALs) is 
negatively influenced by many factors. These factors 
include sequence read quality (cut-off score values), motif 
length (type and number of repeat), and the presence, 
quality, and necessary length of the primer region, in 
addition to the amplification success and confirmed poly- 
morphism of loci across the studied populations. Accord- 
ingly, systematic approaches to estimate success rates of 
microsatellite loci development for quite distinct taxa are 
important to finally obtain a sufficient number of applica- 
ble loci (e.g., Castoe et al. 2010, 2012a,b; Prunier et al. 
2012). For example, in the copperhead snake (Agkistrodon 
contortrix), Castoe et al. (2010) isolated 4,564 PALs from
128,773 reads, but only found 80 tetra-nucleotide PALs 
(i.e., 0.062% of all reads) with more than 10 repeat units. 
In the alpine newt (Ichthyosaura alpestris), Prunier et al. 
(2012) obtained 1015 microsatellite motif-bearing 



sequence reads, with a final yield of 14 microsatellite loci 
from 61 tested primer pair combinations. Microsatellite 
development might be especially tedious in amphibians 
due to their large genome sizes and comparably low num- 
bers of PALs, which have made development and isola- 
tion approaches in the past both cost- and time-intensive 
(e.g., Hendrix et al. 2010; Hauswaldt et al. 2012). Accord- 
ingly, NGS-based microsatellite loci development 
approaches should be efficient in obtaining a sufficient 
number of loci in such species (e.g., amphibians). 

In this study, we used a 454-sequencing approach with 
enriched libraries to develop highly polymorphic 
tetra-nucleotide microsatellite loci for three distinct newt 
species within the family of Salamandridae (Calotriton
asper, Lissotriton helveticus, and Triturus cristatus). We
determined the success rate of our approach by estimat- 
ing the number of PALs based on the number of usable 
polymorphic loci tested across several populations of each 
species and compared it with Illumina-based sequencing 
approaches of recently published studies. Furthermore, we 
tested the cross-amplification success rate of the devel- 
oped loci for C. asper in the highly endemic and threa- 
tened species C. arnoldi, the Montseny brook newt (see 
Carranza and Amat 2005). 

Materials and Methods 
Study species 

Triturus cristatus, the great crested newt (Fig. 1), is widely 
distributed from the United Kingdom to northern France, 
through southern Scandinavia to central Europe, and into 
a small part of the Balkans. The species is listed on the 
Habitats Directive of the European Union (92/43/EEC) 
and is threatened by exposure to fish, habitat loss, and 
habitat fragmentation (see Jehle et al. 2011; Denoel 2012; 
Denoel et al. 2013). Thus far, only eight applicable 
microsatellite loci for T. cristatus have been published 
(Krupa et al. 2002), which might be an insufficient num- 
ber to reveal consistent results in population genetic anal- 
yses (e.g., SPOTG software; Hoban et al. 2013). For the 
two endemic mountain brook newt species of the genus 
Calotriton (C. asper, the Pyrenean brook newt and C. ar- 
noldi, the Montseny brook newt, found in the northeast- 
ern Iberian Peninsula), no microsatellite loci have been 
reported thus far. Both species are endemic to comparably
small ranges (especially C. arnoldi) and are habitat specialists
that are adapted to high mountain brooks and
have a cryptic life history. C. asper is listed as near threatened
(NT), and C. arnoldi is listed as critically endangered
(CR) according to the IUCN Red List v3.1 (http://www.iucnredlist.org/static/categories_criteria_3_1).
Therefore, the development of microsatellite loci for these species
is an important contribution to efforts to better
understand their ecology and evolution and will conse- 
quently assist in their conservation. L. helveticus, the 
palmate newt, is distributed throughout Western Europe 
and is in decline in some parts of Europe due to habitat loss 
and fragmentation (Denoel and Ficetola 2007). Our study 
adds additional microsatellite loci to the already existing set 
of eight loci for this species (Johanet et al. 2009). 

Sampling and DNA extraction 

Tissue samples of T. cristatus were collected from two 
different populations in the Kottenforst near Bonn and in 
the Latumer Bruch in Krefeld (North Rhine-Westphalia, 
Germany) (Table 1). Tissue samples of C. asper were 
collected from four different sampling sites from the 
Spanish and French sides of the Pyrenees (Table 1). Sam- 
ples of C. arnoldi were taken from five different locations 
that were divided into two main sectors (eastern and wes- 
tern) on both sides of the Tordera river valley in the 
Montseny massif (Spain) that were separated by inhospi- 
table habitat. Tissue samples of L. helveticus were col- 
lected from five sampling sites in the Larzac plateau 
(France) (Table 1). Samples were taken by clipping single 
toes or tail tips, with permission of the local administra- 
tive authorities, and stored in 80% ethanol. 

Total genomic DNA was extracted using the sodium- 
dodecyl-sulfate (SDS)-proteinase K/Phenol-Chloroform 
extraction method, after which it was stored in Tris- 
EDTA buffer (10 mmol/L Tris-HCl, 0.1 mmol/L EDTA, 
pH 8.0) and then used for all subsequent reactions. To 
start the enrichment procedure for the three distinct tar- 
get species (i.e., T. cristatus, L. helveticus, and C. asper) 
with more or less similar amounts of genomic DNA, we 
estimated DNA concentrations of different individuals on 
a 1% agarose gel and selected those with a concentration 



Table 1. Geographic locations of the sample sites for the different
populations. For reasons of conservation, the locations of Calotriton
arnoldi sample sites are intentionally not listed.

Species                  Population           Geographic coordinates
Triturus cristatus       Krefeld              6°39'17"E, 51°19'05"N
                         Kottenforst          7°3'13"E, 50°40'24"N
Calotriton asper         Ibon Perramo         0°29'59"E, 42°38'20"N
                         Barranco Valdragas   0°47'38"E, 42°51'29"N
                         Ibon d'Acherito      0°42'25"E, 42°52'46"N
                         Bassies              1°24'58"E, 42°46'04"N
Lissotriton helveticus   Mas d'Aussel         3°19'34"E, 43°58'30"N
                         Campels North        3°34'15"E, 43°57'39"N
                         Bagnelade            3°21'42"E, 43°51'20"N
                         Coulet Northeast     3°32'23"E, 43°49'15"N
                         Le Cros Ferme        3°22'10"E, 43°52'11"N

of 100-200 ng/µL, as suggested by Glenn and Schable
(2005). 

Enrichment of microsatellite loci and 
454-sequencing 

The enrichment protocol followed the selective hybridiza- 
tion method with minor modifications (Zane et al. 2002; 
Glenn and Schable 2005); and the enrichment procedure 
was performed separately for each target species. Genomic 
DNA was digested into approximately 500 bp fragments 
using Rsa I enzyme and Xmn I to avoid linker dimeriza- 
tion. Double-stranded linkers were annealed to both ends 
of the fragments to obtain a primer-binding site 
for subsequent PCR. Linker sequences were as follows: 
SimpleXL03_U: 5'-AAAACGTGCTGCGGAACT-3' and
SimpleXL03_Lp: 5'-pAGTTCCGCAGCACG-3'. PCR was
performed in 25 µL reactions to test whether annealing
was successful; this PCR product was then used for the next 
step to increase the concentration of linker-ligated DNA. 
To capture DNA fragments containing microsatellite loci 
sequences out of all linker-ligated fragments, 50 µL of
Streptavidin M-280 Dynabeads (Invitrogen, Carlsbad, 
CA, USA) was used. To enrich for tetra-nucleotide motif- 
bearing DNA fragments, biotinylated oligo probes and lin- 
ker-ligated DNA fragments were mixed as described by 
Glenn and Schable (2005). For this step, the following oligo 
probes were used: (AAGT)8, (AGAT)8, (ACAT)8, (AAAT)8,
(AACT)8, (AAAC)8, (AAAG)6, (AATC)6, (ACAG)6,
(ACTC)6, (ACTG)6, (AATG)6, and (ACCT)6. PCR was
performed to recover the microsatellite-enriched DNA 
fragments (Glenn and Schable 2005). After amplification, 
all samples were quantified using a Nanodrop spectropho- 
tometer. Afterward, the samples were processed according 
to the cDNA Rapid Library Preparation Method Manual 
(Roche, Mannheim, Germany) beginning with step 3.3 and 
omitting step 3.4. Multiplex Identifier (MID) Adaptors for 
Rapid Libraries (Roche, Branford, CT) were ligated to the 
DNA fragments of each sample (T. cristatus: MID ACACTACTCGT,
MID ACGACACGTAT; C. asper: MID ACGAGTAGACT,
MID ACGCGTCTAGT; L. helveticus: MID
ACGTACACACT, MID ACGTACTGTGT). The DNA fragments
were cleaned and subsequently quantified using an
Agilent 2100 Bioanalyzer. As a final step, the individual 
samples were combined into a DNA library pool, which 
was run on an Agilent 2100 Bioanalyzer prior to emulsion 
PCR and sequencing, as recommended by Roche. The 
library was not denatured prior to pipetting onto the 
washed capture beads (step 3.2.8, emPCR Method Manual 
- Lib-L SV, Roche, Branford, CT, USA). The library was 
subsequently sequenced on a 454 GS-FLX using Titanium 
sequencing chemistry. 
Estimation of microsatellite loci success
rates for the different species 

To estimate the number of PALs based on the numbers 
of polymorphic loci initially identified, we used steps A 
through I described below. (A) The PHOBOS software 
version 3.3.11 (Mayer 2007) was used to assign obtained 
sequence reads to a target species on the basis of the spe- 
cies-specific MID tags. PHOBOS was also used 
to identify sequence reads containing noninterrupted 
stretches of at least ten tetra-nucleotide repeat motifs. (B) 
Selected sequences were retained by PHOBOS only when 
the flanking region on each side of the repeat motif was 
at least 25 bp long. These sequences were then assessed 
by eye for their general suitability for primer design, that 
is, sequences with more than five repetitive nucleotides at 
a stretch were removed. (C) Score values, indicating the 
quality of each retrieved sequence, were assessed, and 
sequences with values below 20 (of a maximum score of 
40) were discarded from further analysis. (D) We 
designed primers using the software Primer 3 (version 
0.4.0, Rozen and Skaletsky 2000) with default settings 
(i.e., an optimum primer melting temperature of 60°C). (E) We
tested all primer pairs for amplification success and sub- 
sequent degree of polymorphism and heterozygosity in at 
least 21 individuals. In a first step, a universal M13-tail 
was attached to the forward primer as a cost-reducing 
method (Schuelke 2000). (F) Only for the polymorphic 
microsatellite loci in C. asper and T. cristatus, we 



designed primers without the M13-tail but with fluorescence
labeling. These primers were tested in a 10 µL
Type-it multiplex PCR (Qiagen) containing 1 µL of DNA
for up to 902 individuals per microsatellite locus. Primers 
were combined in either three (C. asper) or two (T. crista- 
tus) multiplex mixes, supplemented by one mix of previ- 
ously published loci for T. cristatus (see Krupa et al. 2002). 
Applied PCR parameters were as follows: (1) an initial Taq 
polymerase activation step of 5 min at 95°C, (2) 30 s at 
94°C, (3) 90 s at an annealing temperature of 60°C, (4) 
60 s extension at 72°C, (5) steps 2-4 repeated for 30 cycles, 
and (6) a final extension phase of 30 min at 60°C. PCR 
products were diluted in 200 µL of water, and 1 µL of
diluted product was added to 19 µL of Genescan 500-LIZ
size standard (Applied Biosystems) prior to analysis on an 
ABI 3730 96-capillary automated DNA sequencer. (G) 
Analysis of the microsatellites was performed using GENE- 
MARKER software (version 1.95, SoftGenetics, State Col- 
lege). Tests for null alleles, deviations from Hardy- 
Weinberg equilibrium, and linkage disequilibrium were 
performed using CERVUS (version 3.0.3, Field Genetics 
Ltd., Marshall et al. 1998). (H) Primer pairs for C. asper 
were also tested for cross-amplification success in C. ar- 
noldi using the C. asper multiplex mixes with 41 C. arnoldi 
samples originating from two different sectors. Six individ- 
uals were tested from sector 1, and 36 individuals were 
tested from sector 2; PCR conditions were as described for 
C. asper. (I) The estimation of the number of PALs for each 
species was performed by extrapolating the number of suc- 



Table 2. Number of obtained sequence reads, tested primer pairs (NTPP), successfully isolated polymorphic loci (SIPL), and estimated potentially
amplifiable loci (PALs), as well as corresponding calculated success rates for target species using the enrichment-based 454 next-generation
sequencing approach of this study. For comparison, three representative studies using an Illumina next-generation sequencing approach (according to
Castoe et al. 2012b) are broken down in comparable units.

454-approach (this study)

Species         Number of        Tetra-nucleotides   Tetra-nucleotides >10 repeats   NTPP   SIPL   Estimated PALs -
                sequence reads   >10 repeats         + priming sites (>25bp)                       success rate (%)
T. cristatus    19,562           936                 107                             41     11     29 (0.15)
C. asper        52,075           1083                316                             41     20     154 (0.30)
L. helveticus   55,626           1434                319                             22     15     217 (0.39)

Illumina approach

Species                                Number of        PAL_FINDER Di, tri,      Number of     Number of                                Estimated PALs -
                                       sequence reads   tetra, penta, hexa²      tested loci   polymorphic loci                         success rate (%)
Cyprinodon julimes/C. pachycephalus¹   5 million        1590 loci                48            25 (C. julimes)/17 (C. pachycephalus)    828 (0.016)/563 (0.0112)
Leptodea leptodon³                     5 million        3905 loci                48            16                                       1301 (0.026)
Ambystoma annulatum⁴                   5 million        749 loci⁵                150           22                                       109 (0.00219)

¹ Carson et al. (2013).
² Castoe et al. (2012a,b).
³ O'Bryhim et al. (2012).
⁴ Peterman et al. (2013).
⁵ Note that the study of Peterman et al. (2013) only analyzed tetra- and penta-nucleotide motifs.






cessfully isolated polymorphic loci (SIPL) in relation to the 
number of tested primer pairs (NTPP). That rate was then 
used to calculate the number of PALs for sequence reads 
that passed the criteria described in (B) and (C). Here, an 
example calculation is provided. We isolated 11 SIPL of 41 
NTPP for T. cristatus, resulting in a SIPL/NTPP ratio of 
26.83%. Extrapolating this ratio to the 107 sequences 
that passed criteria (B) and (C) resulted in 29 PALs. Thus, 
the overall success rate (PALs/number of sequences) for 
T. cristatus was 0.15% (see Table 2). 
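
The read-level criteria of steps (A) and (B) can be illustrated with a minimal sketch. It is not PHOBOS and does not reproduce its algorithm: it is a plain regular-expression stand-in, and the function name `find_tetra_repeats` and its parameters are hypothetical. It keeps a read only if it carries an uninterrupted run of at least ten tetra-nucleotide units with at least 25 bp of flanking sequence on each side. The quality-score filter of step (C) and the visual check for repetitive flanks are not covered.

```python
import re


def find_tetra_repeats(read, min_units=10, min_flank=25):
    """Return (motif, units, start, end) for each uninterrupted tetra-nucleotide repeat.

    Illustrative stand-in for the repeat search; not the PHOBOS algorithm.
    """
    read = read.upper()
    # A 4-mer followed by at least (min_units - 1) exact copies of itself.
    pattern = re.compile(r"([ACGT]{4})\1{%d,}" % (min_units - 1))
    hits = []
    for match in pattern.finditer(read):
        motif = match.group(1)
        # Skip 4-mers that are really mono- or di-nucleotide repeats (AAAA, ACAC).
        if motif[:2] == motif[2:]:
            continue
        left_flank = match.start()
        right_flank = len(read) - match.end()
        # Both flanks must be long enough to place a primer.
        if left_flank >= min_flank and right_flank >= min_flank:
            units = (match.end() - match.start()) // 4
            hits.append((motif, units, match.start(), match.end()))
    return hits


if __name__ == "__main__":
    # Synthetic read: 32 bp flank, (AGAT)12, 32 bp flank.
    flank_left = "ACGTTGCA" * 4
    flank_right = "CCGGTTAA" * 4
    print(find_tetra_repeats(flank_left + "AGAT" * 12 + flank_right))
```

Because the search is purely positional, a repeat may be reported under a rotated motif (for example TAGA instead of AGAT) when the flanking base happens to extend the period; the sketch makes no attempt to normalize motifs.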

Results 

The 454-sequencing resulted in a total of 127,263 
sequence reads from one quarter run for the enriched 
libraries of the three species. Sequencing results and 
success rates of microsatellite loci development are sum- 
marized in Table 2. The species classification by MID tag 
identification of all sequences performed by PHOBOS 
(Methods, step A) led to 19,562 sequences for T. cristatus 
(15.37% of all sequences), 52,075 sequences for C. asper 
(40.92%), and 55,626 sequences for L. helveticus 
(43.71%). PHOBOS identified 936 sequences (4.78% of 
all T. cristatus reads) containing a noninterrupted stretch 
of ten or more tetra-nucleotide repeat motifs in T. crista- 
tus, 1083 (2.08% of all C. asper reads) sequences in 
C. asper, and 1434 (2.58% of all L. helveticus reads) 
sequences in L. helveticus. The scan for suitable PCR 
priming sites (i.e., 25 bp in minimum in each direction) 
performed in PHOBOS (Methods, step B) resulted in 107 
sequences for T. cristatus, 316 sequences for C. asper, and 
319 sequences for L. helveticus (Table 2). The assessment 
for nonrepetitive flanking regions and the exclusion of 
sequences with score values of poor quality (Methods, steps 
C and D) led to 41 ordered primer pairs for T. cristatus, 41 
for C. asper, and 22 for L. helveticus. In T. cristatus, 14 of 
41 tested loci were polymorphic (Methods, step E) and 
were sequentially labeled with fluorescence dyes. In 
C. asper, 21 of the 41 tested loci were polymorphic, while 
15 of the 22 tested loci were polymorphic in L. helveticus. 
While the 15 L. helveticus loci were analyzed without any 
additional labeling (Methods, step G), the Type-it multi- 
plex PCR of the T. cristatus and C. asper loci resulted in 
the detection of 13 (T. cristatus) and 21 (C. asper) polymorphic
loci (Methods, step F). Validation of these candidate
loci (Methods, step G) yielded 11 loci for T. cristatus,
20 loci for C. asper, and 15 loci for L. helveticus (summa- 
rized in Tables 3-5). Detailed information on the number 
of tested individuals per population, the expected and 
observed heterozygosity, tests for deviations from Hardy- 
Weinberg equilibrium with a Bonferroni correction, 
linkage disequilibrium, and report of null alleles is provided 
for each species in supplementary Tables S1-S3. The test 
for cross-amplification success of C. asper primer pairs in
C. arnoldi (Methods, step H) resulted in 10 polymorphic 
loci (Table 4). The SIPL/NTPP ratios (Methods, step I) 
for T. cristatus, C. asper, and L. helveticus were 26.83%, 
48.78%, and 68.18%, respectively. The adoption of this 
ratio to calculate the number of PALs resulted in 29 tetra- 
nucleotide PALs for T. cristatus, 154 for C. asper, and 217 
for L. helveticus. The PALs/number of sequences ratio was 
0.15% for T. cristatus, 0.30% for C. asper, and 0.39% for 
L. helveticus (Table 2). 
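
The extrapolation of step (I) can be retraced with a short sketch using the counts reported above. The function name `estimate_pals` is hypothetical. One assumption is made explicit: the SIPL/NTPP ratio is rounded to two decimal places (in percent) before it is applied to the candidate sequences, because this order of rounding appears to reproduce the reported value of 217 PALs for L. helveticus, whereas the unrounded ratio gives exactly 217.5.

```python
def estimate_pals(sipl, ntpp, candidates, reads):
    """Extrapolate potentially amplifiable loci (PALs) from tested primer pairs.

    sipl       -- successfully isolated polymorphic loci
    ntpp       -- number of tested primer pairs
    candidates -- sequences with >= 10 repeats and suitable priming sites
    reads      -- total sequence reads obtained for the species
    """
    # Assumption: the ratio is rounded to two decimals (percent) before use.
    ratio_pct = round(100 * sipl / ntpp, 2)
    pals = round(ratio_pct / 100 * candidates)
    success_pct = round(100 * pals / reads, 2)
    return ratio_pct, pals, success_pct


if __name__ == "__main__":
    species = {
        "T. cristatus": (11, 41, 107, 19562),
        "C. asper": (20, 41, 316, 52075),
        "L. helveticus": (15, 22, 319, 55626),
    }
    for name, counts in species.items():
        ratio_pct, pals, success_pct = estimate_pals(*counts)
        print(f"{name}: SIPL/NTPP {ratio_pct:.2f}%, PALs {pals}, success rate {success_pct:.2f}%")
```

The success rate is expressed relative to all reads obtained for a species, not relative to the motif-bearing reads, which is why the values stay below one percent even when most tested primer pairs are polymorphic.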

Discussion 

Comparison of success rates of NGS-based 
microsatellite loci development 

For many nonmodel organisms, the de novo development 
of microsatellite loci has been enormously improved by 
the implementation of NGS approaches. In the past, the 
development of microsatellite loci for amphibian species 
using classic cloning approaches was rather time-consum- 
ing and costly, possibly due to their large genomes, which 
are comparatively rich in long repetitive DNA stretches
resembling in part microsatellite motifs (e.g., in salaman- 
ders; see fig. 1.1 in Steinfartz 2003). Based on our experi- 
ences, only the use of enrichment procedures (see Zane 
et al. 2002) enabled the development of a sufficient num- 
ber of polymorphic microsatellite loci applicable for 
genetic studies in various amphibian target species (e.g., 
Steinfartz et al. 2004; Hauswaldt et al. 2008, 2012; Hen- 
drix et al. 2010). Here, we employed an NGS approach to 
sequence genomic sublibraries enriched for tetra-nucleo- 
tide motifs of three newt species, for which only a limited 
number (in the case of T. cristatus and L. helveticus) or 
no (C. asper) loci had previously been available. Although 
NGS approaches have certainly improved the develop- 
ment of new microsatellite loci, many studies using this 
approach do not report actual success rates of applicable 
loci compared with the large number of sequencing reads 
initially obtained (e.g., Gardner et al. 2011). This might 
lead to the impression that, by using NGS approaches, 
very high numbers of new loci can easily be developed. 
However, the mere occurrence of a microsatellite locus
motif in a sequencing read does not guarantee that this 
locus can be developed into an applicable polymorphic 
locus for subsequent genetic analyses. Low sequence read 
quality, motif length (type and number of repeat), and 
the presence and appropriate length of the surrounding 
primer region are major factors that can dilute the fraction 
of potentially amplifiable loci (PALs) enormously. Our aim 
was therefore to develop high-quality tetra-nucleotide 
motif-bearing microsatellite loci with a demonstrated util- 
ity for subsequent genetic analyses of respective target 
Table 3. Characterization of the full set of 17 applied microsatellite loci for Triturus cristatus, including the 11 newly developed primer pairs from
this study (highlighted in bold) along with six previously published loci (Krupa et al. 2002). Loci are grouped by the multiplex combinations used 
for amplification. Information on the locus name, primer sequence, direction (F is forward, R is reverse), annealing temperature of the primer for 
PCRs, microsatellite motif, amplified fragment size range, number of alleles, and labeling dye are provided together with the accession number of 
the associated GenBank sequence. 



Locus name Primer sequence (5-3') 



Size range of 

Annealing Repeat motif amplification Number Fluorescence GenBank 
temp. (°C) of cloned alleles product of alleles labeling accession 



GTGATGGTTGCCAAGC 60°C 
GATCCAAGACACAGAATATTTAG 

GATCCACTATAGTGAAAATAAATAATAAG 60°C 
CAAGTTAGTATATGATATGCCTTTG 

CGAGTTGCCCAGACAAG 60°C 
GATCACATGCCCATGGA 

CCAACTGGTATGGCATTG 60°C 
GATCACAGAAACTCTGAATATAAGC 

GATCATCTGAATCCCTCTG 60°C 
ATACATTCATGACGTTTGG

CAAGTTTCCTCTGAAGCCAG 60°C 
GTTTCTTGCCTGACAAAGTAATGCTTC 



(GT) 36 

Interrupted 
(GAAA) 27 



93-131 
241-288 
(TTTC) 2 2(CA) 1 , 289-340 



(GAAA) 32 

Interrupted 
(GAAA) 36 

Interrupted 
(TTTC) 23 



200-234 
214-320 
253-311 



10 
18 
11 
10 
24 
15 



FAM 

FAM 

NED 

NED 

VIC 

PET 



AJ292500 
AJ292517 
AJ292505 
AJ292490 
AJ292491 
AJ292494 



GCGGATACATGGTCTTCGTT 

TTCAGTTAAAAGTGTCCTCTGTGG 

GGCTCTTCGACTGAATGGAG 

CGGTCAATTGGTTGTAGCAG 

CCTTTGTACACCACTGGCAAA 

TGGTCCTATAAAGCCATCTTGG 

AAAGTGCACTCTTTCTCTGAAGC 

TGCAAAGTGCATGTGTGACT 

GGGTTGCAAAGCACCTTAAT 

TACCTGGGTCCTCCTCCAAG 

TTTAGTCTCTCCGCTCTGCAA 

AGCGGAATCTGCCTTATGGT 



60°C 
60°C 
60°C 
60°C 
60°C 
60°C 



(ACTC) 18 
(ATTG) 17 
(ATCC) 18 
(ATCC) 24 
(ACAT) 14 
(AATC) 13 



177-268
190-206
218-250 
130-196 
216-232 
124-160 



26 
6 
8 

13 
6 

10 



PET 

VIC 

FAM 

FAM 

VIC 

VIC 



KF442195 
KF442196 
KF442197 
KF442198 
KF442199 
KF442200 



ACAGGCAGTGCGAAAGAAAG 

CTGACCCAAGACCACCTCTC 

AGGTAGCCTTCCGCCACTAT 

GCTTGATCCTGGCATGAAAT 

CCGCCAATCAGCAATATTTA

AGTGGAAGCACCTGCTGAAG 

TCTGTGACATGTCCTGATAGTGAA 

TAGCACCATGAGACCCTCAC 

GTTAGACCTCGCATCTGTTGG 

CCTCAAGACCTGGCTCTACG 



60°C 
60°C 
60°C 
60°C 
60°C 



(AATC) 7 

(AGAT) 13 

(ACAT)„ 

(AATC) 13 

(AATC)„ 



200-204 
168-188 
179-199 
181-213 
161-169 



NED 

NED 

PET 

FAM 

VIC 



KF442201 
KF442202 
KF442203 
KF442204 
KF442205 



species. Based on the number of obtained polymorphic 
loci, we extrapolated species-specific success rates, which 
were found to be quite low, that is, below one percent (see 
Table 2). 

Although our success rates seem unexpectedly low,
they are in line with numerous other studies from which
success rates are reported or can be calculated. Using 454-sequencing
technology but no specific enrichment protocol,
Castoe et al. (2010) identified 80 tetra-nucleotide
PALs (0.06% of total sequencing reads) with more than 10
motif repeats for the copperhead snake (Agkistrodon contortrix).
In a parallel study in the coral snake (Micrurus
fulvius), they were able to identify 54 tetra-nucleotide 
PALs (0.20% of total sequencing reads) with more than 10 
motif repeats (Castoe et al. 2012a). Our success rate of 
0.15% for T. cristatus (29 PALs) was similarly low. In con- 
trast, the estimated 154 PALs for C. asper with a success 
rate of 0.30% and 217 PALs for L. helveticus with a success 
rate of 0.39% were considerably higher. When applying 
more relaxed comparison criteria between studies, 454- 
Table 4. Characterization of the full set of 20 applied microsatellite loci for Calotriton asper. Loci are grouped by multiplex combinations used
for amplification. Locus name, primer sequence, direction (F is forward, R is reverse), annealing temperature of the primer for PCRs, microsatellite
motif, amplified fragment size range, number of alleles, labeling dye, and GenBank accession number are provided. The number of alleles of
C. asper microsatellite loci detected in C. arnoldi cross-amplification is also provided; polymorphic loci are highlighted in bold.

Locus   Primer sequence (5'-3')           Annealing    Repeat motif   Size range of   Number   Number of    Fluorescence   GenBank
name                                      temp. (°C)   of cloned      amplification   of       alleles in   labeling       accession
                                                       allele         product         alleles  C. arnoldi                  numbers

Multiplex 1
Ca1     F: TGGAACAGATGGCGTTGTAA           60           (AGAT)16       158-170         4        3            FAM            KF442206
        R: TTCCTGCAACCTCCTTGTCT
Ca3     F: CCATGCATTCTTGGAGGTTT           60           (AGAT)15       246-288         5        6            FAM            KF442207
        R: TTCAAAGGCAGTGTTTCAGG
Ca7     F: ACCCTTACACACCCCAAACC           60           (AGAT)16       232-264         6        5            NED            KF442208
        R: GTTCCCTGCATGGCTCTAAA
Ca21    F: AGCGTGTGCAGCAGTATCC            60           (AGAT)12       234-266         7        5            VIC            KF442209
        R: GCAATGTGCCATTCATTACC
Ca22    F: CTTCAGACTGCCGAGTGTTG           60           (AGAT)13       140-144         2        5            PET            KF442210
        R: ACCTTGTCACGGTGTAGGAAG
Ca24    F: GTGATGTCATGTGCGAGGTC           60           (AGAT)15       164-180         4        1            NED            KF442211
        R: GGACCTATGTAAATAGCCCACCT
Us7     F: C 1 bCACCGAI 1 AAI 1 GLAGA     60           (ACAT)16       234-242         5        4            PET            KF442212
        R: CTGCACCACTCGCTCCTC

Multiplex 2
Ca8     F: AGAAGGGAGTCAGGCAGACA           60           (AGAT)13       174-182         3        2            FAM            KF442213
        R: GGAGGATCAAATGTGTTTGGA
Ca16    F: GGCAACAATGATGGGTATGC           60           (AGAT)20       119-131         4        -            FAM            KF442214
        R: ACCGCATGCATGATAGTGCT
Ca23    F: CGTGCCTGAAACCTATGG             60           (AGAT)14       226-266         9        -            PET            KF442215
        R: TTGCTTCACCTCATCCACTG
Ca38    F: CCTGTTAGGTGAAGGTGAGCA          60           (AATG)12       166-178         3        -            VIC            KF442216
        R: CTGGTAGCCATGCGCTTTAT
Us2     F: TGGGCTGAAGGATTGAAAAA           60           (AGAT)17       242-250         3        6            VIC            KF442217
        R: CTCAGCTGCAGTGGTGTGTT
Us3     F: AAGTTTGTAGGTATGCATAATAGCC      60           (AGAT)16       184-192         3        3            NED            KF442218
        R: GGAAGTCCAGGCCTGTAGAC

Multiplex 3
Ca5     F: CGTTATTGTCGTGTGGATGG           60           (ACAT)10       218-222         2        -            VIC            KF442219
        R: TGCTAGTGTAGATCCCTTCATCG
Ca20    F: CAGCGGTAATACCATCAGGA           60           (AGAT)15       200-232         10       -            FAM            KF442220
        R: CCACAGATCCTTCTGCAACA
Ca25    F: CCTTTGTCCCTGTTCAGTGC           60           (ACAT)14       168-172         2        2            PET            KF442221
        R: TTTGCAGATGCATTGTGTGA
Ca29    F: TCCATAAGCCATTATTGTGTGC         60           (AATC)10       246-258         4        1            PET            KF442222
        R: AGTGCACTGCCTCAGCATGT
Ca30    F: TCACACATCATGCAGCTTACC          60           (AATC)10       108-120         3        -            VIC            KF442223
        R: GACCCTCATGGGTGTGTAGC
Ca32    F: ACAGGGCAAGAGAGTCAACG           60           (ACAG)10       148-200         6        4            NED            KF442224
        R: CAGCCTATTGGCTTGTCAGC
Ca35    F: GGCGCTTTACAAGTGCTACC           60           (ACTC)14       126-166         8        -            FAM            KF442225
        R: CTGCCACAAGGTAGAGGTCA



based microsatellite loci development in the meadow viper 
resulted in only 14 applicable loci out of 37,000 sequence 
reads (0.037% success rate) and in a success rate of only 
0.007% in the Asp viper (Geser et al. 2013) - both studies 
were performed without prior enrichment of genomic 
sublibraries. Accordingly, our obtained success rates are 



comparatively high and seem to justify the applied
enrichment procedure. 

There is no doubt that Illumina sequencing is far
more cost-effective than 454-sequencing. Castoe et al.
(2012b) suggest that Illumina-based microsatellite loci
development is far more effective than 454-based

Table 5. Characterization of the full set of 15 applied microsatellite loci for Lissotriton helveticus. Locus name, primer sequence, direction (F is
forward, R is reverse), annealing temperature of the primer for PCRs, microsatellite motif, amplified fragment size range, number of alleles and
GenBank accession number are provided.

                                                  Repeat motif  Size range of
Locus                                 Annealing   of cloned     amplification  Number      GenBank
name    Primer sequence (5'-3')       temp. (°C)  allele        product        of alleles  accession
Lh1     F: CAGCTGCAAGCGACGAAG         60          (AGTG)20      156-228        10          KF442226
        R: GTTCACACGGATTTGGTTGG
Lh2     F: TGGCAGGAGAGAGGTTTCAT       60          (ATGT)12      154-226        6           KF442227
        R: TTGGGACCCTACGGGTAAGT
Lh6     F: CTGGTGATGTGCTCAGGAGA       60          (ATAG)12      156-228        9           KF442228
        R: GGAACTGCTTCAATGCCTCT
Lh7     F: AACATTCCACGCTGTCATCA       60          (AATG)10      185-189        2           KF442229
        R: GGTCACCGTGCGCTTTATTA
Lh9     F: GCACATGGTGGAGCTTCAAA       60          (AGAT)10      172-248        13          KF442230
        R: GACTTGACTGGACCTACTAGTGACA
Lh12    F: CTCATTACCAAGTCCTGCTTTG     60          (AGAT), 9     164-176        4           KF442231
        R: GGTCGGCTCTTTGTTGCTAA
Lh13    F: GTCCCCACAGCGTGTGTTAT       60          (AACT)16      188-208        5           KF442232
        R: CCTCCTGCAGTCCACACC
Lh14    F: GCAACATCCTCACGTTCTGA       60          (AATC)„       216-244        5           KF442233
        R: AGCGCATTTAGACCCTCACA
Lh16    F: TACAGCCTCAGCCATTCACA       60          (AATC)13      130-142        4           KF442234
        R: TGATGAGATGCGCTCTATAAATAC
Lh17    F: GCACATGGTGGAGCTTCAAA       60          (AGAT)10      175-215        4           KF442235
        R: GACTTGACTGGACCTACTAGTGACA
Lh18    F: GCGCCAGGATACTCTCAAGT       60          (ACAT), .     137-157        5
        R: CAATGGTGAAGGAAGGGCTA
Lh19    F: CAGTTGTCGCTGGAGGTTG        60          (AATC)10      191-215        6           KF442237
        R: CTGCCAGTTCCTAGATACACTCA
Lh44    F: TTTGAGGGACACAACTGATTTT     60          (AATC)15      231-259        6           KF442238
        R: CTCGCCTTCAGGAGACAACT
Us4     F: CCATCCTTCCGAGCTCAATA       60          (AGAT)23      196-204        3           KF442239
        R: TGGGATGGTGTGTCTAAGGTG
Us9     F: TGGATACCCTGTCAGGTGATTA     60          (ATTG)15      136-186        6           KF442240
        R: TGCAAGACAGAAGGCTGACA




Figure 1. Male of the crested newt (Triturus cristatus), one of the 
target species for which microsatellite loci have been developed 
(photograph by B. Thiesmeier). 



approaches. In their comparative analysis, they obtained quite
high success rates (called the discrete PAL rate), ranging from
37% to 50%, for Illumina- and 454-based approaches.
However, one important drawback of this
study was that loci were not specifically tested for final
performance, so success rates might be strongly
overestimated. As Illumina sequencing is now commonly
applied for microsatellite loci development, we tried to
estimate the success rates of three studies representative
of quite diverse organisms: fish (Carson et al.
2013), Bivalvia (O'Bryhim et al. 2012), and salamanders
(Peterman et al. 2013). Although Illumina sequencing
yielded a higher number of suitable loci, final success
rates were one order of magnitude lower than those of
the combined enrichment and 454-sequencing approach (see
Table 2). Here too, enrichment for certain microsatellite
loci motifs (e.g., tetra-nucleotides as in our approach)
seems to be highly efficient when compared to pure
Illumina-based sequencing, as evidenced by comparing the success
rates of our study with those of Peterman et al. (2013).

Implications for the use of new microsatellite loci for endangered amphibian species

The 11 (T. cristatus), 20 (C. asper), and 15 (L. helveticus)
newly developed polymorphic tetra-nucleotide microsatellite
loci will be a tremendous help in furthering our knowledge
of the population biology of these locally endangered
species, ultimately building a basis for improved conservation
measures. For T. cristatus, for example, the set of 19 applicable
microsatellite loci will facilitate a more detailed identification
of population structure, dispersal, and migration
rates across small geographic scales and will even allow
the genetic assignment of single individuals to populations
with high confidence. In addition, effective
population sizes of subpopulations, or even of populations
from single ponds, can now be estimated with much higher
resolution. The large number of newly developed applicable
loci for C. asper sets the foundation for revealing the
interesting population biology of this cryptic endemic
mountain species, as well as that of its sister species C. arnoldi.
In particular, new insights into dispersal propensity
gained from genetic estimates will be important for elucidating
population connectivity, the extent of single populations,
the most common reproductive strategies, and how
life-history traits relate to individual genotypes. Also, it
can be tested whether the unexpectedly high genetic differentiation
of C. asper populations based on AFLP markers
(Mila et al. 2009) is corroborated by microsatellite loci.

Previous genetic studies in C. arnoldi suggested that
eastern and western populations belong to two evolutionarily
significant units (ESUs; Valbuena-Urena et al.
2013) and proposed maintaining a breeding program
for individuals from each unit separately. Further
genetic studies using microsatellite markers to infer
the genetic diversity of the species, the current gene flow
among populations, and their possible isolation are
urgently needed to evaluate the conservation status more
precisely. With the ten C. asper microsatellite loci that
successfully cross-amplified in C. arnoldi, we will be able
to study the structure of different populations within the
Montseny species range in much greater detail.

For L. helveticus, the new loci will be particularly useful
for understanding the distribution and success of
alternative phenotypes within the species range. Indeed,
L. helveticus is one of the three European newt species in
which facultative paedomorphosis is most regularly
reported (Denoel 2007; Denoel et al. 2009). This process
results in the retention of larval traits in adults in part of
a population, while other individuals metamorphose into
the terrestrial morph. Dimorphic populations are particularly
common in southern France, where the highest rate of
dimorphism is observed in an area that covers only 0.5% of
the species range (Denoel 2007). The 23 microsatellite loci
now available for this species will be useful for testing
evolutionary hypotheses based on gene flow among paedomorphic
and metamorphic individuals (see Denoel 2002).

Conclusion 

The use of NGS (454 and Illumina sequencing) has strongly
facilitated the development of microsatellite loci. However,
most studies leave it unclear how effectively new
microsatellite loci can be developed from the large number
of sequencing reads obtained from NGS. Our comparative
study on three distinct newt species
demonstrates that, despite low overall success rates, the
combination of enrichment protocols and NGS can result
in considerably higher numbers of polymorphic tetra-nucleotide
microsatellite loci. Our study draws a more
realistic picture of the efficiency of microsatellite loci
development in amphibian species and shows that 454-based
microsatellite loci development is still competitive
with Illumina-based approaches.